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ABSTRACT 

This  thesis  proposes  a  new  analysis/synthesis  procedure  for  speech  and  image 
compression.  The  algorithm  applies  the  discrete  wavelet  transform  to  the  subject  data  in 
order  to  obtain  a  set  of  multiresolution  wavelet  coefficients.  The  wavelet  coefficients  are 
then  encoded  by  using  a  multiresolution  codebook  designed  using  the  generalized  Lloyd 
algorithm.  The  statistical  properties  of  the  wavelet  coefficients  are  utilized  to  determine 
the  number  of  resolution  levels  as  well  as  the  codebook  size  at  each  resolution  level. 
Coding  results  show  that  the  new  procedure  provides  a  significant  improvement  in  the 
quality  of  the  reproduced  data.  The  data  tested  includes  speech,  image,  and  transient 
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INTRODUCTION 


The  recent  emergence  of  wavelet  transforms  has  encouraged  many  researchers  to 
seek  new  application  areas  for  them  (Ref.  10-12].  Since  discrete  wavelet  transforms 
(DWT)  are  similar  to  subband  coding  systems  and  subband  coders  have  been  used  in 
speech  and  image  compression,  wavelets  have  found  immediate  application  in  the 
waveform  coding  area  (Ref.  10-12).  Data  compression  consists  of  converting  a  data 
sequence  into  a  stream  of  relatively  low  bit  rate  data  for  transmission  over  a  digital 
communication  channel  or  storage  in  a  digital  media.  The  DWT  itself  does  not  reduce 
the  total  bit  rate,  but  it  provides  a  set  of  wavelet  coefficients  using  an  algorithm  with 
pyramidal  architecture.  Since  wavelet  coefficients  at  different  resolution  levels  exhibit 
different  statistical  characteristics,  by  designing  a  quantizer  for  each  set,  we  can  reduce 
the  bit  rate  without  significant  degradation  in  quality. 

In  this  thesis,  we  present  a  coding  algorithm  that  combines  the  DWT  and  vector 
quantization  (VQ).  VQ  is  a  method  of  quantizing  a  set  of  data  samples  jointly  as  a 
vector.  The  overall  procedure  is  known  as  an  analysis-synthesis  method  and  is  extensively 
used  in  transform  and  subband  coding  areas.  It  involves  two  steps.  First,  we  use  the 
discrete  wavelet  transform  to  obtain  a  set  of  wavelet  coefficients;  second,  we  vector 
quantize  these  coefficients  by  using  codebooks  designed  for  each  set  of  coefficients.  The 
application  of  wavelet  transforms  to  this  technique  is  relatively  new.  The  DWT  has  been 
applied  to  image  coding  by  combining  scalar  quantization  and  vector  quantization  (Ref. 
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10].  Another  algorithm  has  been  reported  that  uses  lattice  vector  quantization  [Ref  11]. 
The  algorithm  we  present  here  uses  generalized  Lloyd  algorithm  [Ref.  8]  for  vector 
quantization  of  wavelet  coefficients.  We  apply  this  coding  technique  both  to  speech  and 
to  image  data. 

The  remainder  of  this  thesis  is  organized  as  follows.  In  Chapter  n.  we  introduce 
the  basic  principles  of  wavelet  theory.  This  chapter  also  covers  the  algorithms  for  the 
discrete  wavelet  transforms  in  both  one  and  two  dimensions.  Chapter  ID  describes  the 
basic  concepts  of  vector  quantization  and  presents  the  vector  quantization  algorithm  used 
m  this  thesis.  In  Chapters  IV  and  V,  we  combine  these  algorithms  to  develop  coding 
methods  for  speech  and  image  compression.  Chapter  VI  presents  conclusions. 
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n.  WAVELET  TRANSFORMS 


A.  INTRODUCTION 

Currently  wavelet  transforms  are  a  popular  topic  of  research  in  various  signal 
processing  applications.  They  are  viewed  as  a  new  way  to  represent  signals  as  well  as 
a  new  technique  for  time-frequency  analysis.  From  the  multiresolution  point  of  view, 
they  provide  powerful  tools  for  speech  and  image  coding  applications.  In  this  chapter  we 
briefly  discuss  the  basic  wavelet  theory  and  discrete  wavelet  transforms  with  orthogonal 
wavelets.  We  also  explore  the  biorthogonal  case  in  the  last  section. 

B.  WAVELET  THEORY 

Wavelet  transforms  can  be  considered  an  alternative  to  classical  short-time  Fourier 
transforms  (STFT).  The  basic  difference  is  that  wavelet  transforms  use  variable  window 
sizes  that  change  along  the  frequency  range.  The  STFT  analyzes  signals  by  using  a  single 
window  size  [Ref.  1];  consequently,  the  time-frequency  resolution  is  fixed  over  the  entire 
time  frequency  plane  (see  Fig.  2.1(a)).  Wavelet  transforms  provide  an  analysis  method 
known  as  constant-Q  or  constant-relative  bandwidth  analysis.  In  this  method  we  use  a 
family  of  analysis  filters  where  the  time  resolution  increases  with  the  center  frequency 
of  the  filters.  As  shown  in  Fig.  2.1(b),  wavelet  analysis  provides  better  frequency 
resolution  at  low  frequencies  and  better  time  resolution  at  high  frequencies.  In  other 
words,  the  time-frequency  window  is  flexible  such  that  it  automatically  narrows  at  a  high 
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Figure  2.1  Coverage  of  Time-Frequency  Plane  for  (a)  STFT  and  (b)  WT. 

center  frequency  and  widens  at  a  low  center  frequency.  However,  the  area  under  this 
window  is  independent  of  the  center  frequency. 

In  order  to  perform  wavelet  transforms,  a  set  of  basis  functions  called  wavelets  are 
required.  These  basis  functions  are  obtained  from  a  prototype  function,  \l/(t),  called  the 
mother  wavelet,  by  means  of  dilations  and  translations.  The  basis  functions  then  take  the 
form: 


M  “ 


where  a  and  b  are  the  dilation  and  translation  parameters,  respectively.  A  basis  function 


becomes  a  dilated  (low  frequency)  version  of  the  prototype  for  a  large  value  of  a  while 
it  is  a  contracted  (high  frequency)  version  for  a  small  value  of  a.  The  dilation  and 
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contraction  of  the  mother  wavelet  in  the  time  domain  cause  a  corresponding  contraction 
and  dilation  in  the  frequency  domain. 

The  continuous  wavelet  transform  (CWT)  of  x(t)  is  defined  as  the  inner  product  of 
the  form  [Ref.  13]: 

(2.2) 

where  the  asterisk  indicates  complex  conjugation.  Since  the  are  orthogonal,  the 
original  signal  can  be  reconstructed  by  using  the  inverse  relationship: 

x(0  =cff  W^ia,b)\^^j,(t)dadb,  (2-3) 

where  c  is  a  constant  of  proportionality. 

So  far  we  have  mentioned  only  the  wavelet  function,  \j/(i).  For  multiresolution 
analysis,  we  also  need  another  basis  function  called  the  scaling  Junction,  <j)(t).  The 
scaling  function  is  used  to  transform  the  signal  from  a  fine  to  a  coarse  scale  [Ref.  1]; 
moreover,  its  operation  can  be  viewed  as  lowpass  filtering  while  that  of  wavelet  can  be 
considered  as  highpass  filtering.  The  scaling  function  has  the  form: 

0-‘t) 

■  /R  “ 

Equation  2.1  and  2.4  are  similar  in  form. 

1.  Discretization  of  Dilation  and  Translation  Parameters 

The  continuous  wavelet  transform  has  two  drawbacks,  namely,  redundancy 
and  impracticality.  Both  problems  can  be  solved  by  sampling  the  dilation  and  translation 
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parameters.  This  procedure  leads  to  a  family  of  wavelets  and  scaling  functions  with 
discrete  parameters. 

Let  a=a”  and  b=kb,ft,^,  then  \p(t)  and  <j>(t)  defined  in  Eq.  2.1  and  Eq.  2.4 

become: 


=aQ”%(ao”t-kb^), 


(2.5) 


(2.6) 


where  m,  k  are  integers  and  Cg  >  1,  bo  ^  0.  Inserting  Eq.  2.5  into  Eq.  2.2  yields 
wavelet  coefficients  in  the  discrete  form  [Ref.  5]: 


Wjim,k)  =«o  x(t)^iao'"t-kb()dt .  (2*7) 

The  wavelet  family  is  to  be  chosen  so  that  the  original  signal  can  be  uniquely 
reconstructed  (Ref.  1].  In  particular,  they  must  form  an  orthonormal  basis  in  L^(R),  i. 
e.,  they  must  satisfy: 


fl,  m=m',k=k' 


(2.8) 


[0,  otherwise . 

For  the  orthonormality  condition,  it  can  be  shown  that  the  parameter  Op  and  bo  must  be 
chosen  as  a„  =  2  and  bo  =  1  which  imply  the  dyadic  scale  [Ref.  1]. 


Multiresolution  Analysis 


The  idea  of  multiresolution  analysis  was  introduced  by  Mallat  [Ref.  3].  In  his 
study,  he  used  this  approach  to  construct  orthonormal  wavelets.  Assume  that  the  original 
signal  x(t)  is  measurable  and  has  finite  energy.  We  successively  decompose  x(t)  into 


approximation  and  wavelet  coefficients.  This  decomposition  is  done  by  convolving  x(l) 
with  the  scaling  and  wavelet  filters. 

Multiresolution  analysis  consists  of  a  set  of  closed  subspaces 
{V„  I  0>m>M}  G  Lr(R),  where  we  express  the  resolution  level  of  the  original  signal 
at  and  the  lowest  resolution  level  at  m=M  [Ref.  5].  The  subspaces  share  certain 
properties: 

•  Causality  :  The  vector  space  of  resolution  level  m  +  1  is  a  subset  of  the  vector 
space  of  the  upper  resolution  level  (m). 

<- coarser  ...  C  C  V„  C  C  ...  finer  ^ 

•  Completeness:  The  vector  spaces  must  satisfy  the  following  operations: 

c^v„  =  {0}. 

•  Scaling:  The  spaces  of  approximated  signals  can  be  computed  by  scaling  each 
approximated  signal  by  the  ratio  of  their  resolution  values,  i.e.,  if  x(t)  G  V„,  then 
x(2t)  G  V,.,. 

•  Orthogonality:  V„.y  =  V„  ®  W„,  where  W„  is  the  orthogonal  complement  of  V„ 

in  V„.y  ,  i.e.,  1  W„.  In  other  words,  is  equivalent  to  V„  plus  some  added 

detail  corresponding  to  W„.  Furthermore,  this  property  and  the  causality  property 
together  imply  that  the  direct  sum  of  all  W„  spans  L^(R): 

■  ■  ■  «■») 

We  associate  the  scaling  function  (f>(t)  with  V„,  and  the  wavelet  function  \{/(t)  with 
W„.  We  can  write  ^(t)  as: 


<l>(0=253  W4>(2f-n),  (2.10) 

n 


and  similarly  we  can  express  as: 
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ili(f)=252«(«)'l»(2r-n). 


(2.11) 


where  the  coefficient  set  h(n)  is  the  unit  impulse  response  of  a  lowpass  filter  while  g(n) 
is  the  unit  impulse  response  of  a  highpass  filter.  The  filter  g(n)  is  derived  from  h(n)  by: 


g(n)=(-l)-A(l-n). 


(2.12) 


The  filters,  h(n)  and  g(n),  must  satisfy  certain  conditions  for  a  perfect  reconstruction, 
[Ref.  14];  namely: 


'£Hn)  =  l 


(2.13) 


(2.14) 


2^2  h(n-2m)h(n-2k)=b(m-k) . 


(2.15) 


The  wavelet  filter,  g(n),  must  also  satisfy  Eq.  2.14  and  2.15,  but  in  place  of  Eq.  2.13, 
g(n)  must  satisfy: 


(2.16) 


From  the  orthogonality  condition,  h(n)  and  g(n)  together  satisfy  the  following  equations: 


25^  {h{m-2n)h{k-2n)^gim-2n)g{k-2n)\^b{jn-k)  (2.17) 
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(2.18) 


53  h(n  -2k)g(n -2m)  =0 

n 

Equation  2.10  through  2.18  describe  the  essential  properties  of  these  functions. 

C.  DISCRETE  WAVELET  TRANSFORMS 

The  discrete  wavelet  transform  (DWT)  was  developed  for  processing  discrete-time 
signals  [Ref.  3,4].  The  DWT  consists  of  a  series  of  identical  operation  blocks  with  both 
time  and  dilation/translation  parameters  in  discrete  form.  In  this  section,  we  start  with 
the  DWT  for  the  1-D  case  and  then  extend  it  to  the  2-D  case. 

1.  One-Dimensional  DWT 

The  one-half  resolution  approximation  of  a  discrete  signal  x(n)  can  be 
obtained  by  convolving  it  with  a  lowpass  filter  followed  by  subsampling  by  two.  This 
forms  the  basic  procedure  for  decomposing  the  original  sequence  into  lower  resolution 
levels.  At  the  same  time,  convolving  the  sequence  at  each  level  with  a  highpass  filter 
gives  us  the  wavelet  coefficients.  Figure  2.2  depicts  the  pyramidal  decomposition  scheme 
of  the  DWT  while  Figure  2.3  shows  the  operations  at  a  given  resolution  level.  Each 
block  in  Fig.  2.2  performs  the  following  operations: 

•  Convolution  with  the  argument  filter,  h(n)  or  g(n). 

•  Subsampling  (decimation)  by  two. 

•  Normalization  by  >^2. 

The  wavelet  coefficients  are  the  ones  we  need  to  save  in  order  to  reconstruct  the  original 
signal.  We  also  need  to  save  the  approximation  coefficients  of  the  lowest  level,  M.  In 
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other  words,  the  DWT  representation  of  x{n)  consists  of  the  coefficient  set:  {x,^,  d„, 
m  =  l,..  ,M} ■ 

a.  Decomposition 

Given  a  discrete  time  sequence  x(n)  G  L^(R),  the  approximation 
coefficients  are  computed  recursively  from: 

m=0,l,...JVf-l  (2.19) 

k 

where  Jc„(n)  is  the  signal  at  resolution  level  m\  xjn)  is  the  original  signal.  This  operation 
is  equivalent  to  lowpass  filtering  of  xjn)  followed  by  subsampling  by  two.  Consider  an 
FIR  structure  for  the  filter,  h(n).  The  filtering  operation  in  Eq.  2.19  can  be  represented 
as  a  matrix  vector  product: 

where 


h{L-\)  hiL-2)  hiO)  0  0 

0  0  A(L-l)  hil)  hil)  KO) 


(2.21) 


and  x„  =  [  xJ-1),  xJO),  xjl),  ]^.  The  matrix,  H,  must  satisfy  the  condition  [Ref.  4]: 


(2.22) 
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Figure  2.2  General  Decomposition  Algorithm  of  DWT 


d  ,(n) 

m+l 


Figure  2.3  One-Step  Decomposition  Algorithm 
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where  /  is  the  identity  matrix,  and  (.)^  stands  for  the  transposition  operation.  Notice  the 
similarity  between  Eq.  ?  22  and  Eq.  2.14.  Using  the  same  algorithm  with  the  wavelet 
filter  gives  us  wavelet  coefficients  of  lower  resolution  level: 


gik-2n)xjk) ,  (2.23) 

k 

where  the  coefficient  set  g(n)  is  defined  by  Eq.  2. 12.  If  we  build  a  matrix,  G,  in  a  form 
analogous  to  Eq.  2.21,  we  have: 

2GG^=/,  <2.24) 

and 


HG^=0 


(2.25) 


which  follows  from  Eq.  2.18 
b.  Reconstruction 

We  know  that  the  subspaces  in  which  the  signals  are  represented  satisfy 
K.  /  =  K.  ®  ^m-  This  implies  that  we  can  reconstruct  each  higher  level  by  adding  the 
projections  from  the  two  orthogonal  subspaces  of  the  current  level.  The  projections  of 
a  sequence  onto  and  W„  are  given  by: 


i=2H^Hx^  and  d=2G^Gx.  (2.26) 

m  in  mm 

From  Eq.  2.9  and  2.25  the  projections  are  orthonormal  and  span  complete  subspaces. 
Thus,  from  Eq.  2.26  and  the  relation  x„  =  x„  +  d„,  we  have: 
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(2.27) 


The  reconstruction  algorithm  is  depicted  in  Fig.  2.4  and  has  the  form: 

^  g{n-2k)dj)c)\.  (2.28) 

* 

This  formula  is  derived  as  follows,  substituting  Eq.  2.19  and  2.23  into  Eq.  2.28  and 
taking  Eq.2.17  into  consideration,  we  observe  that  the  righthand  side  of  Eq.  2.28 
becomes: 


v/553  [h(n~2k)xjk)^g{n-2k)djjc)\ 

k 

=v/25;  {hin-2k)[^  h(j-2k)x^_,{ri\^g{n-2k)[^Y. 

*  >  j 

=25;  Y.  {Kn-2k)h{j-2k)-g{n-2k)g{j-2k)]x^_^{j) 

*  ; 

(2.29) 


This  completes  the  set  of  formulas  needed  for  the  one-dimensional  DWT. 


2.  Two-Dimensional  DWT 

We  now  extend  the  one-dimensional  DWT  to  the  two-dimensional  case.  This 
can  be  done  by  using  multidimensional  extensions  of  wavelets.  Assume  that  we  have  a 
set  of  subspaces  Vj  satisfying  the  multiresolution  conditions  described  in  Section  B.2. 
Then,  we  can  define  a  vector  space  V„  as  the  tensor  product  of  these  subspaces  in  Lr(R}): 


V  <2.30) 

m  m  m 

If  is  a  one-dimensional  scaling  function  of  Vj,  then  we  can  derive  the 

corresponding  separable  two-dimensional  scaling  function  as: 
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Figure  2.4  One-Step  Reconstruction  Algorithm. 


<I»(f,,r2)=(j)(r,)(l){f2).  (2.31) 

Similarly,  two-dimensional  wavelets  can  be  obtained  as: 

Y‘(f,,r2)=<|)(r,)i|f(r2),  (2.32) 

T^(f,,t2)=i|»(t,)4)(f2),  (2.33) 


These  functions  are  orthogonal  to  each  other  (follows  from  the  orthogonality  of  one¬ 
dimensional  components).  Table  2. 1  summarizes  the  two-dimensional  basis  functions  and 
coefficients  as  well  as  their  frequency  characteristics.  [Ref.  1 ,  3] 
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Table  2.1  SUMMARY  OF  BASIS  FUNCTIONS. 


Basis  Coefficients  Frequency 

Functions  Characteristics 


Vertical  Lowpass 
Horizontal  Lowpass 

d‘.„(n,,n2) 

Vertical  Lowpass 
Horizontal  Highpass 

d^„(n„m) 

Vertical  Highpass 
Horizontal  Lowpass 

dUn,,n.) 

Vertical  Highpass 
Horizontal  Highpass 

a.  Decomposition 

We  can  obtain  the  approximation,  xjn,,n2)^  and  wavelet  coefficients, 
dJniMz),  i  =  1.2,3,  by  convolving  the  approximation  coefficients  of  higher  resolution 
level  by  the  horizontal  and  vertical  scaling  and  wavelet  filters; 


E  ^„(WAa-2n2)/i,(fc-2n,),  (2.35) 

*  I 

dLl{n^,n^)=2Y,  E  (2.36) 

k  I 

dL\ini,n^)=2Y,  E  xjk,l)g^il-2n^)h^ik-2n^),  (2.37) 

*  / 

dLM^,n,)=2Y,  E  ^J<f^'^Sh^l-2n^)g^{k-2n^)  (2.38) 

k  I 


The  procedure  for  2-D  decomposition  is  depicted  in  Fig.  2.5.  Here,  we  have  two  levels 
of  l  -D  decomposition;  one  is  convolution  with  a  horizontal  basis  filter  in  the  horizontal 
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direction  (for  example,  rows  of  an  image)  and  the  second  is  convolution  with  a  vertical 
basis  filter  in  the  vertical  direction  (for  example,  columns  of  an  image). 

b.  Reconstruction 

The  reconstruction  algorithm  follows  the  same  procedure  of  matrix 
multiplication  in  horizontal  and  vertical  directions  as  in  the  decomposition  scheme  (see 
Fig.  2.6).  In  analogous  to  Eq.  2.25  and  2.27,  we  have: 


(2.39) 


The  reconstruction  algorithm  in  Fig.  2.6  can  be  expressed  in  matrix  notation  as: 


(2.40) 


Substituting  Eq.  2.35  through  2.38  into  Eq.  2.40  and  using  the  property  of  Eq.  2.39,  we 
can  show  that: 


.2CX[fl,C,r J  ,2,41, 

*2hJg; [GM  ^2gJg; [G,C/ J 


The  quality  of  reconstruction  in  both  the  1-D  and  2-D  cases  depends  on 
the  precision  of  the  filter  coefficients.  If  we  use  the  Haar  filter  (  h=[0.5  0.5J),  we  do  not 
introduce  any  reconstruction  errors.  For  other  filters,  such  as  Daubechies  filters,  it  is 
possible  to  have  some  errors  in  the  reconstructed  signal  because  these  filter  coefficients 
require  infinite  precision  [Ref.  2].  In  practice,  however,  ‘he  resulting  error  is  still 
insignificant  (around  10“’  for  our  signals). 


16 


HORIZONTAL 


VERTICAL 


Figure  2.5  Decomposition  Algorithm  of  2-D  DWT. 


VERTICAL  HORIZONTAL 

Figure  2.6  Reconstniction  Algorithm  of  2-D  DWT. 
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D.  BIORTHOGONAL  WAVELETS 

We  prefer  to  use  short  filters  for  fast  computation  and  because  fewer  redundant 
samples  are  caused  by  the  convolution.  Short  filters  do  not  provide  adequate  smoothness 
however.  For  image  analysis,  linear  phase  filters  are  preferred  [Ref.  4].  In  order  to  have 
filters  of  arbitrary  length  with  linear  phase,  we  have  to  sacrifice  orthogonality  [Ref.  4]. 
The  solution  to  this  problem  is  to  use  filters  based  on  a  sp)ecial  class  of  functions  called 
biorthogonal  bases. 

Biorthogonality  requires  use  of  different  filters  for  analysis  and  synthesis  operations 
with  the  decomposition  and  reconstruction  algorithms  described  in  Section  B  of  this 
chapter.  Figure  2.7  shows  a  one  step  decomposition  and  reconstruction  scheme  using 
biorthogonal  filters.  The  conditions  for  perfect  reconstruction  are  satisfied  by  imposing 
orthogonality  separately  over  the  decomposition  and  reconstruction  stages  [Ref.  4].  Using 
the  matrix  notation  we  have  used  in  Eq.2.21,  v.'e  have: 

H,Gj=G^,=0,  (2.42) 

2//j^j=2GoG,=/  (2.43) 

Given  the  analysis  and  synthesis  scaling  filters,  hjn)  and  h,(n),  the  corresponding 
wavelet  filters  are  expressed  as: 
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Figure  2.7  DWT  with  Biorthogonal  Wavelets. 

<2.44) 

<2.45) 

Notice  the  similarity  between  Eq.  2.25  and  Eq.  2.42,  and  Eq.  2.22  and  Eq.  2.43  (in  the 
orthogonal  case  H,=Hj  and  G,=Gj).  Finally,  perfect  reconstruction  in  biorthogonal 
bases  requires: 

(2.46) 

The  biorthogonal  DWT  algorithm  for  one-dimensional  signals  can  be  easily 
extended  to  the  two-dimensional  case  by  using  biorthogonal  filters  in  the  vertical  and 
horizontal  directions. 
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in.  VECTOR  QUANTIZATION 


A.  INTRODUCTION 

Vector  quantization  (VQ)  is  generalization  of  scalar  quantization  where  a  scalar 
value  is  represented  by  one  of  several  discrete  levels.  In  vector  quantization  a  vector 
quantity  is  represented  by  one  of  several  other  fixed  vectors  which  are  close  to  the 
original  vector  in  some  sense.  According  to  Shannon’s  rate  distortion  theory,  better 
results  are  always  obtained  when  vectors  instead  of  scalars  are  encoded  [Ref.  15].  As  the 
vector  dimension  becomes  large,  VQ  can  approach  the  rate  distortion  limit  for  the 
problem.  In  typical  applications,  bit  rates  of  lower  than  one  bit  per  sample  are  achievable 
with  vector  quantization;  these  rates  are  unlikely  for  scalar  quantization. 

B.  QUANTIZER  DESIGN 

Vector  quantizers  map  vectors  in  a  multidimensional  space  into  a  finite  set  of 
reproduction  vectors  called  the  codebook.  The  codebook  is  then  used  to  encode  and 
decode  the  data  set.  Figure  3.1  shows  a  schematic  of  a  basic  vector  quantizer.  The 
encoder  quantizes  blocks  of  the  discrete  time  signal  x(n)  by  using  the  codebook 
constructed  previously  and  encodes  it  into  a  sequence  of  bits  {b),  which  is  then 
transmitted  through  the  transmission  channel  or  stored  in  some  storage  medium.  If  there 
are  no  channel  errors,  then  {^/}  =  {/?,}.  At  the  receiver  end,  a  decoder  converts  the 
sequence  of  bits  {/?,  ’}  into  a  vector  representing  blocks  of  data  and  thus  into  the  signal. 
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Figure  3.1  Basic  Structure  of  Vector  Quantization. 

s(n).  The  output  s(n)  is  the  reconstructed  signal,  which  will  be  an  approximation  of  the 
input  signal  x(n). 

Given  the  basic  structure  of  VQ,  we  now  proceed  to  discuss  the  details  of  VQ. 

Assume  that  x  =  [Xi,  Jtj . Xf^Y  is  an  N-dimensional  vector  whose  components 

{x^,  1  <k<N}  are  real  and  continuous  (amplitude)  random  variables.  Specifically,  the  jc* 
represent  data  samples  in  a  block  of  data  from  the  signal  x(n).  A  vector  quantizer  maps 
X  into  another  real-valued,  discrete-amplitude,  N-dimensional  vector  y.  This  operation 
can  be  described  as: 

y=g(x),  <3.1) 

where  q(.)  is  the  quantization  operatoi.  The  reproduction  vector  is  chosen  from  a  set 
of  vectors  M={y„  l<i<L},  where  ^<=0,/,  y,2,  .  .  •  ,  The  set  M  is  called  the 
codebook,  where  L  is  the  size  of  the  codebook.  The  quantity 
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/?=Iog2l 


(3.2) 


is  the  rate  of  the  quantizer  in  bits  per  vector  while  r=R/N  is  the  rate  in  bits  per  sample 
(  or  bits  per  pixel  if  the  input  is  image  data).  To  design  a  codebook,  we  partition  the  N- 
dimensional  space  of  input  vectors  x  into  L  cells  {Q.  <L}  and  associate  a  vector  3'^ 

with  each  cell  Q.  Then  we  use  the  code  vector  3;^  to  represent  vectors  that  fall  within  cell 
C,,  that  is: 

^(x)=y-,  ifxeC..  (3.3) 

The  vector  quantizer  is  said  to  be  a  minimum-distortion  quantizer  if  some  measure 
of  distortion  is  minimized  over  all  code  vectors.  The  measure  of  distortion  is  defined  in 
terms  of  a  distortion  function,  d(x,y),  which  represents  the  cost  of  representing  any  input 
vector  X  by  a  code  vector  y'.  Although  there  are  many  possible  distortion  measures  [Refs. 
6,  8],  we  choose  the  mean-square  error  (MSE): 

^  H  ~ yif' 

due  to  its  simplicity  and  mathematical  tractability.  This  distortion  measure  is  simply  the 
squared  Euclidean  distance  between  x  and  y. 

To  design  an  optimal  (minimum-distortion)  quantizer  using  this  distortion  function, 
there  are  two  necessary  conditions  [Ref.  7].  First,  the  quantizer  must  satisfy  the  nearest 
neighbor  condition: 
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qi.x)=y^,  iff  d(x,y)^d{x,yp,  j*i,  (3.5) 

That  is,  for  each  /,  all  input  values  that  are  closer  to  code  vector  y,  than  to  any  other 
code  vector  should  be  assigned  to  ceil  C,.  Therefore: 

d(x,q(x))=B^{d(x,y)}.  (3.6) 

y,eM 

Secondly,  the  code  vectors  y,  must  satisfy  the  centroid  condition;  each  code  vector  y,  is 
chosen  such  that  y^  minimizes  the  MSE  in  cell  C,.  In  this  case,  it  is  easily  shown  that: 

y.=cent(C).  (3-7) 

where: 

1  ^ 

cenr(C.)  =  ^53ac*  (3-8) 

for  all  of  the  vectors  {jc*,  k=l ,  2,  .  .  .  ,  K}  contained  in  cell  Q. 

Based  on  the  conditions  defined  above,  an  iterative  procedure  can  be  defined  to 
optimize  the  quantizer  starting  from  an  arbitrary  set  of  code  vectors.  The  iteration  takes 
place  until  convergence  is  obtained.  The  algorithm  used  is  the  generalized  Lloyd  (GL) 
algorithm  [Ref  8]  also  known  as  the  LBG  (Linde,Buzo,Gray)  algorithm  [Ref  9].  The 
algorithm  is  detailed  as  follows  : 

•  Step  1  (Initialization);  Set  m  =  i.  Choose  an  initial  codebook  M,. 

•  Step  2  (Classification):  Given  the  codebook  M,,  classify  the  set  of  training  vectors 
{x*.  1  <k^L}  into  cells  C,  according  to  the  nearest  neighbor  condition. 

•  Step  3  (Updating);  Update  the  code  vector  y.  of  each  cell  C,  by  the 

centroid  condition  (see  Eq.  3.7). 
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•  Step  4  (Termination  Test):  Compute  MSE  for  If  it  has  changed  by  an 

amount  less  than  a  prespecified  value,  stop;  otherwise,  go  to  Step  2. 

A  flow  chart  for  this  algorithm  is  shown  in  Fig.  3.2. 

The  algorithm  above  requires  an  initial  codebook  for  the  first  step.  There  is  a 

variety  of  techniques  available  for  generating  the  initial  codebook  (see  Refs.  6  and  8  for 

different  methods).  In  this  work,  we  choose  the  random  coding  technique.  Given  a  set 

of  training  vectors,  this  method  randomly  selects  L  code  vectors.  Once  the  initial  code 

vectors  are  selected,  the  GL  algorithm  is  performed  to  improve  the  codebook.  If  an 

initial  code  vector  corresponds  to  a  cell  where  there  are  no  training  vectors,  the  empty 

cell  problem  arises.  In  this  thesis  we  assume  that  a  cell  is  empty  if  it  has  three  or  fewer 

training  vectors.  If  an  empty  cell  arises,  we  split  the  code  vector  with  the  highest  number 

of  training  vectors  into  two  code  vectors  for  that  particular  cell  and  repartition  the  cell 

into  two  cells.  The  empty  cell  is  then  discarded.  This  algorithm  forms  Step  2 

(classification)  of  the  GL  iteration  (see  Fig.  3.3).  The  criterion  to  terminate  the  iteration 

in  Step  4  (Figure  3.2)  is  based  on  the  change  in  MSE,  which  is  computed  as: 

D  -D  , 

A(Af5£)  =  — = — — .  (3  *») 

where  D„=d(x,y)  for  iteration  m. 

The  quantization  procedure  described  here  is  known  as  Jiill  search  VQ  (FSVQ) 
since  all  code  vectors  are  tested  for  quantizing  each  input  vector.  Other  methods  of  VQ 
exist  (see  Refs.  6  and  8),  but  were  not  explored  in  this  thesis. 
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Initialization 

m«l 


Stop 


Figure  3.2  Flow  Chart  for  Generalized  Lloyd  Algorithm. 
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Empty  Cell  ' 
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1 

Figure  3.3  Modification  of  GL  Algorithm  to  Avoid  the  Empty  Cell  Problem 
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rv.  ALGORITHM  DEVELOPMENT  FOR  SPEECH  DATA  AND  ANALYSIS 


A.  INTRODUCTION 

The  previous  two  chapters  provide  an  overview  of  discrete  wavelet  transform  and 
vector  quantization.  In  this  chapter  we  combine  these  concepts  to  develop  algorithms  for 
sp>eech  waveform  compression.  We  also  include  a  test  case  for  transient  signal 
compression  in  the  last  section. 

The  method  used  here  is  known  as  analysis-synthesis  coding.  It  has  been  commonly 
used  in  subband  coding  and  other  transform  coding  techniques  [Ref.  8,  16].  This  method 
analyzes  the  signal  into  components  that  in  some  sense  provide  a  more  primitive 
representation  of  the  signal,  and  quantizes  these  components.  The  quantized  components 
are  then  used  to  synthesize  a  reproduction  of  the  original  signal.  Figure  4.1  illustrates 
the  basic  steps  of  this  algorithm.  Notice  that  Figure  4.1  is  a  modified  version  of  Fig.  3.1, 
where  we  directly  quantize  the  original  signal  without  decomposing  it  into  components. 

In  this  study,  the  DWT  is  used  for  analysis-synthesis  operations  while  vector 
quantization  is  the  method  chosen  for  encoding  the  wavelet  coefficients.  The  DWT 
provides  a  set  of  wavelet  coefficients  for  each  resolution  level.  Each  set  of  coefficients 
has  different  statistical  characteristics,  and  by  designing  a  quantizer  for  each  coefficient 
set  optimally,  the  quantization  error  can  be  reduced.  This  error  reduction  at  each  level 
results  in  an  improvement  in  the  quality  of  reproduction  of  the  original  data. 
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Figure  4.1  Analysis-Synthesis  Coding  Scheme. 


In  the  following  sections  we  develop  algorithms  for  speech  data  based  on  the 
analysis-synthesis  coding  scheme.  We  also  present  the  experimental  results  of  the 
algorithm  and  comparisons  of  this  method  with  direct  vector  quantization  (DVQ). 


B.  ALGORITHM  DEVELOPMENT 

The  analysis  stage  for  speech  data  consists  of  a  four-level  DWT  decomposition. 
Here,  DWT  decomposes  the  speech  data  into  four  sets  of  wavelet  coefficients,  {d„, 
m  =  4},  and  one  set  of  approximation  coefficients,  defined  with  Eq.  2.23  and 

Eq.  2.  19,  respectively.  From  this  point  on  we  call  the  combined  set  of  coefficients 
and  d„„  m  =  4}  the  wavelet  coefficients  for  convenience.  Figure  4.2  shows  the 
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Frequency 


Figure  4.2  Division  of  Frequency  Domain  for  DWT. 

division  of  the  frequency  spectrum  for  each  resolution  level.  At  each  resolution  level,  the 
lower  band  is  divided  into  low  and  high  frequency  bands  and  subsampled  in  the  time 
domain  by  two  as  explained  in  Chapter  H,  Section  C.  The  scaling  filter,  h(n),  that  we 
have  chosen  for  the  DWT  is  Daubechies  10-tap  filter  listed  in  Ref.  2.  The  wavelet  filter, 
g(n),  is  derived  from  the  scaling  filter  by  Eq.  2.12. 

The  next  step  after  obtaining  the  wavelet  coefficients  is  to  construct  the  codebooks 
for  each  set.  The  sizes  of  the  codebooks  are  determined  by  the  number  of  bits  allocated 
for  each  codebook.  The  bit  allocation  is  a  major  concern  in  the  design  of  a  coding 
system.  It  consists  of  distributing  a  given  quota  of  bits  to  various  quantizers  in  order  to 
optimize  the  overall  coder  performance  [Ref.  8]. 
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1.  Optimum  Bit  Allocation 

Fewer  bits  can  be  allocated  to  the  high  frequency  components  than  to  the  low 
frequency  components  since  the  corresponding  coefficients  usually  have  a  smaller 
variance.  It  is  also  possible  to  discard  all  coefficients  at  a  given  resolution  level,  m,  if 
the  normalized  energy; 


E 


E 


(4.1) 


in  that  level  is  less  than  a  predetermined  threshold  value.  The  threshold  value  is 
determined  such  that  a  coefficient  set  with  low  energy  level  does  not  have  a  significant 
contribution  to  the  reproduction  of  the  original  data.  Discarding  some  coefficient  sets  by 
this  approach  allows  us  to  allocate  more  bits  to  encode  the  coefficients  at  other  levels 
with  significant  energy  values. 

While  the  energy  E(dJ  is  used  as  a  threshold  to  select  the  desired  wavelet 
coefficient  sets,  the  variance  of  the  corresponding  wavelet  coefficients  is  used  in  bit 
allocation  calculations.  Let  B  represent  the  desired  average  number  of  bits  per  vector, 
and  aj  be  the  variance  of  the  wavelet  coefficients  at  resolution  level  m.  Then  the  optimal 
bit  assignment  is  given  by  [Ref.  8]: 


1 

— - 

m  ^ 


(4.2) 
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where 


p^=(noi)'/*  (4.3) 

m=l 

is  the  geometric  mean  of  the  variances  of  the  wavelet  coefficients,  and  the  parameter 
is  defined  as  the  ratio  of  the  number  of  coefficients  at  resolution  level  m,  L„,  to  the 
number  of  samples  in  the  original  signal,  L,;. 


The  number  of  coefficients,  L„,  at  resolution  level  m  is  given  as: 


(4.4) 


2.  Vector  Quantization  of  Wavelet  Coefficients 

The  energy  distribution  of  speech  data  over  the  wavelet  coefficients  has 
certain  similarities.  For  a  four-level  DWT  decomposition,  the  energy  is  typically 
concentrated  in  the  wavelet  coefficients  at  resolution  levels  3  and  4,  dj  and  while  the 
wavelet  coefficients  d,  and  have  relatively  less  energy.  Figure  4.3  shows  the  energy 
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Figure  4.3  Energies  of  Wavelet  Coefficients. 

distribution  of  wavelet  coefficients  of  a  sample  speech  data.  As  the  energy  levels  of  d, 
and  JC4  are  low  compared  to  those  of  dj  and  we  discard  the  wavelet  coefficients  d,  and 
X4  by  replacing  them  with  zeros.  (The  threshold  was  set  at  E(dJ -0.06.)  The  quality  of 
the  reproduced  speech  indicates  that  these  coefficients  {d,  and  X4)  do  not  have  a 
significant  contribution  in  the  reproduction. 

Having  determined  the  coefficients  to  encode,  there  is  another  problem  to  be 
resolved:  the  small  size  of  training  sets  at  lower  resolution  levels.  Equation  4.5  indicates 
that  the  total  number  of  samples  decreases  by  approximately  a  factor  of  two  with  each 
resolution  level;  furthermore,  this  implies  that  the  size  of  the  training  set  for  that 
particular  resolution  level  shrinks  accordingly.  One  way  to  compensate  for  this  effect  is 
to  use  a  large  number  of  samples  in  the  original  data.  However,  in  some  cases,  such 
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large  data  sets  may  not  be  available.  Instead,  we  use  the  following  technique  to  extend 
the  data  sets  at  low  resolution  levels.  Consider  the  resolution  level  m,  where  we  have 
roughly  half  the  number  of  samples  as  at  resolution  level  m-1  after  carrying  out  the 
DWT.  We  now  shift  the  data  at  level  m-1  by  one  sample  and  repeat  the  DWT 
decomposition  for  the  shifted  data.  This  gives  us  a  second  set  of  coefficients  for 
resolution  level  m.  By  combining  the  two  sets  of  coefficients,  we  double  the  number  of 
coefficients  at  resolution  level  m.  The  resulting  increase  in  the  training  set  size  offsets 
the  decrease  caused  by  subsampling. 

The  next  step  is  to  construct  the  codebooks  by  using  the  training  data  sets 
obtained  by  the  procedure  outlined  above.  The  sizes  of  codebooks  are  determined  by  the 
bit  allocation  algorithm  described  in  Eq.  4.2.  The  wavelet  coefficients  at  selected 
resolution  levels  are  then  encoded  by  using  the  corresponding  codebooks.  In  order  to 
reconstruct  the  original  data,  we  decode  the  wavelet  coefficients  by  using  the  same 
codebooks  and  take  the  inverse  DWT.  Here,  we  substitute  zeros  for  the  discarded 
wavelet  coefficients. 

C.  SIMULATION  RESULTS 

In  this  section  we  study  the  performance  of  the  wavelet  transform  coding  (WTC) 
using  speech  data.  Simulations  were  carried  out  by  using  two  data  sets: 

•  A  training  sequence  of  104100  samples  of  ordinary  speech  from  three  different 
speakers  sampled  at  8  kHz. 

•  A  test  sequence  of  20330  samples  from  a  speaker  not  in  the  training  sequence. 
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The  test  sequence  was  also  encoded  by  DVQ  for  comparison.  We  kept  the  average  bit 
allocation  at  B=2  bits  per  sample  (bps)  (see  Eq.  4.2)  and  changed  the  vector  size  to 
obtain  different  bit  rates.  The  codebook  sizes  for  wavelet  coefficients  are  listed  in 
Table  4.1.  The  distortion  criterion  defined  in  Eq.  3.9  for  the  GL  algorithm  was  set  at 
0.01.  The  performance  of  the  quantization  was  measured  by  using  a  signal-to-noise  ratio 
(SNR)  defined  as: 


5N/?  =  101og,o 


E{x^) 

D 


dB 


(4.6) 


where  E(r)  is  the  signal  power  and  D  is  the  mean-square  error  between  the  original  and 
reconstructed  signal.  The  signal  power  is  the  estimated  variance,  o/,  of  signal  samples 
assuming  that  the  input  has  zero  mean. 

Table  4.1  CODEBOOK  SIZES  FOR  WAVELET  COEFHCIENTS 


Wavelet  Coefficients 

d, 

d2 

ds 

d. 

X4 

Codebook 

Size 

0 

8 

64 

512 

0 

The  performance  of  WTC  in  terms  of  SNR  vs.  bit-rate  (bits  per  sample)  is 
plotted  in  Fig.  4.4  as  the  solid  line.  The  dotted  line  represents  the  performance  of  DVQ. 
Clearly,  WTC  is  superior  to  DVQ  with  a  SNR  improvement  of  between  19%  and  55  %, 
from  0.18  bps  to  1.0  bps.  Figure  4.5  shows  a  segment  of  original  test  sequence. 
Figure  4.6  shows  an  encoded  example  with  a  bit  rate  of  0.18  bps  and  an  SNR  of  3dB. 
The  overall  quality  is  fairly  good  and  the  speech  is  understandable.  The  same  speech 
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10 


Figure  4.4  SNR  vs.  Bit  Rate:  Speech  Data.  The  solid  line  represents  WTC  while  the 
dotted  line  represents  DVQ. 

segment  encoded  by  DVQ  is  shown  in  Fig.  4.7.  The  bit  rate  was  0. 18  bps  and  SNR  was 
2.5dB.  Although  the  SNR  values  of  WTC  and  DVQ  are  fairly  close,  the  smoothing 
capability  of  WTC  renders  the  speech  more  understandable.  Figures  4. 8-4. 9  show  the 
reproduced  speech  segments  encoded  at  bit  rates  of  1.03  bps  for  WTC  and  1  bps  for 
DVQ.  Table  4.2  summarizes  the  bit  rates  and  the  contribution  of  each  wavelet  coefficient 
set  to  the  overall  distortion. 
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Figure  4.8  Test  Speech  Coded  by  WTC  at  1.03  bps,  SNR  =  9.56  dB. 


Figure  4.9  Test  Sj>eech  Coded  by  DVQ  at  1.0  bps,  SNR  =  6.2  dB. 
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Table  4.2  BIT  ALLOCATIONS  AND  MSE  CONTRIBUTIONS  OF  WAVELET 
COEFFICIENTS  FOR  AN  AVERAGE  BIT  RATE  OF  1.03  BPS. 


Wavelet  CoefTicients 

d, 

d2 

d, 

d. 

X4 

Bit  Rate  (bps) 

0 

0.3760 

0.3774 

0.2853 

0 

MSE 

0.(X)06 

0.0021 

0.0067 

0.0020 

0.0154 

Interestingly,  given  an  average  number  of  bits,  B,  the  performance  of  WTC 
increases  rapidly  over  the  performance  of  DVQ  when  we  use  smaller  vector  dimensions. 
(Recall  that  we  use  larger  vectors  to  obtain  small  bit  rates  in  Fig.  4.4.)  This  is  caused 
by  the  effect  of  correlation  between  samples.  At  lower  levels,  the  wavelet  coefficients 
become  less  correlated  and  result  in  less  degradation  of  quality  for  small  vector 
dimensions. 

We  have,  additionally,  tested  the  WTC  algorithm  for  coding  transient  signals.  For 
the  simulation  study,  a  transient  signal  of  2400  samples  was  utilized.  Since  the  energy 
levels  of  the  wavelet  coefficients,  X4  and  were  low  compared  to  those  of  d„  and 
d^,  a  three- level  DWT  decomposition  was  chosen  to  analyze  the  original  signal.  Among 
the  four  sets  of  wavelet  coefficients  and  i=l,  2,  3),  Xj  was  discarded  due  to  its 
low  energy  level  (see  Table  4.3),  and  the  bit  allocation  algorithm  in  Eq.  4.2  was  applied 
to  the  coefficients,  d,,  d^  and  dj,  keeping  the  average  number  of  bits  at  B  =  2.  Table  4.3 
shows  the  codebook  sizes  for  the  different  resolution  levels  as  well  as  the  energy  levels 
of  the  wavelet  coefficients  at  those  resolution  levels. 
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Table  4.3  ENERGY  LEVELS  AND  CODEBOOK  SIZES  OF  WAVELET 
COEFnCIENTS  OF  TRANSIENT  SIGNAL 


Wavelet  Coefficients 

d. 

d3 

*3 

Energy 

0.0690 

0.8152 

0. 1030 

0.0140 

Codebook  Size 

2 

16 

8 

0 

As  in  the  case  of  speech  waveform  coding,  the  test  signal  was  coded  by  both  WTC 
and  DVQ  to  allow  good  comparisons.  The  vector  dimensions  were  changed  to  achieve 
varying  bit-rates  in  terms  of  bits  per  sample.  The  original  signal  is  shown  in  Fig.  4. 10. 
Figure  4. 1 1  shows  the  signal  coded  by  WTC  at  1.0  bps.  The  same  test  signal  coded  by 
DVQ  at  1.0  bps  is  displayed  in  Fig.  4.12.  The  SNR  of  the  WTC-coded  signal  was  7.1 
dB  while  the  SNR  in  case  of  DVQ  was  only  3.9  dB.  The  WTC  reproduces  the  transient 
signal  better  because  it  decomposes  the  original  signal  into  different  resolution  levels 
corresponding  to  different  frequency  bands.  Also,  the  DVQ  does  not  contain  a  sufficient 
number  of  code  vectors  to  represent  the  changing  structure  of  the  transient  signal  well. 
The  improvement  realized  by  WTC  is  confirmed  by  the  SNR  performance  displayed  in 
Fig.  4.13. 

For  the  two  test  cases  (speech  waveform  and  transient  signal  coding),  the 
improvement  in  the  performance  of  WTC  over  DVQ  in  terms  of  SNR  varied  from  19  to 
84  percent.  As  stated  before,  this  advantage  is  a  result  of  utilizing  different  codebooks 
to  quantize  the  wavelet  coefficients  at  different  resolution  levels  and  also  of  discarding 
components  that  do  not  contribute  to  the  reproduction  of  the  original  signal. 
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Figure  4.13  SNR  vs.  Bit  Rate:  Transient  Signal.  The  Solid  Line  Represents  the 
WTC,  and  the  Dotted  Line  Represents  DVQ. 
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V.  ALGORITHM  DEVELOPMENT  FOR  IMAGE  DATA  AND  ANALYSIS 


A.  INTRODUCTION 

In  this  chapter  we  extend  the  algorithm  developed  in  the  previous  chapter  to  the 
two-dimensional  case.  The  algorithm  consists  of  the  same  operations  illustrated  in 
Fig.  4.1.  The  following  sections  provide  a  detailed  overview  of  the  algorithm  and  the 
simulation  results. 

B.  ALGORITHM  DEVELOPMENT 

The  analysis  step  of  the  proposed  coding  method  includes  a  three- level  DWT 
decomposition  in  two-dimension  as  illustrated  in  Fig.  5. 1 .  A  three-level  instead  of  a  four- 
level  decomposition  as  used  in  the  speech  waveform  coding  was  chosen  because  the 
correlation  between  coefficients  of  an  image  at  lower  resolution  levels  decreases 
significantly,  and  this  makes  encoding  the  wavelet  coefficients  at  those  levels  extremely 
difficult.  Table  5.1  shows  the  varian».es  of  the  wavelet  and  approximation  coefficients  at 
different  resolution  levels.  Notice  the  increase  in  variance  of  at  the  fourth  resolution 
level.  One  way  to  compensate  the  effect  of  increase  in  variance  would  be  to  use  more 
bits  in  that  particular  resolution  level.  But,  using  more  bits  increases  the  computational 
cost  involved  in  generating  the  codebook  and  finding  the  closest  codeword  for  each 
vector;  thus,  we  stop  the  decomposition  at  the  third  resolution  level. 
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Figure  5.1  Three- level  DWT  Decomposition. 


Table  5.1  VARIANCES  OF  WAVELET  COEFFICIENTS  (10") 


As  we  mentioned  in  Section  D  of  Chapter  O,  we  would  like  to  have  short  wavelet 
filters,  and  we  also  prefer  to  use  filters  with  linear  phase  characteristics.  By  relaxing  the 
orthogonality  and  choosing  biorthogonal  basis  functions,  it  is  possible  to  use  linear  phase 
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filters.  Throughout  the  following  experiments,  we  used  the  7-tap  biorthogonal  filters 
listed  in  Table  5.2  which  are  short  and  have  linear  phase. 

Table  5.2  7-TAP  BIORTHOGONAL  HLTER  COEFHCIENTS 


n 

Analysis  Filter  (ho) 

Synthesis  Filter  (h,) 

0 

0 

0.607142857143 

±1 

-0.05 

0.260714285714 

±2 

0.25 

-0.053571428571 

±3 

0.60 

-0.010714285714 

1.  Optimum  Bit  Allocation 

The  three-level  DWT  provides  10  sets  of  wavelet  coefficients  d^,  m,  i 
=  1.  2,  3}  for  quantization.  Figure  5.2  shows  the  average  energy  distribution  of  wavelet 
coefficients  of  sample  images;  the  energy  of  Xj  is  not  shown  to  scale  in  order  to  make 
other  energy  bars  visible.  (The  actual  value  forxj  is  shown  on  top  of  the  corresponding 
bar.)  As  can  be  seen,  most  of  the  energy  is  concentrated  in  the  energies  contained  in 
the  other  coefficients  are  much  smaller.  Using  the  bit  allocation  scheme  defined  in  the 
previous  chapter  leads  to  a  problem  because  of  this  uneven  energy  distribution  and  the 
high  variance  ofxj  (See  Table  5.1);  all  bits  are  assigned  to  wavelet  coefficients  Xj  leaving 
no  bits  to  represent  the  other  wavelet  coefficients.  If  we  discard  the  other  coefficients  and 
encode  only  Xj,  this  causes  another  problem.  The  edge  information  of  the  image  is  mostly 
contained  in  these  coefficients,  and  the  edges  in  the  image  are  lost  if  we  discard  all  of 
these  coefficients.  In  order  to  allocate  some  bits  to  quantize  d„,  we  first  reserve  a  large 
number  of  bits  for  Xj  taking  the  computational  cost  into  account  and  share  the  remaining 
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Figure  5.2  Energy  Distribution  of  Wavelet  Coefficients  Normalized  to  the  Original 
Signal  at  Resolution  Level  m=0. 


bits  among  by  using  the  bit  allocation  algorithm  of  Eq.  4.2.  The  wavelet  coefficients 
d and  (f ,  are  discarded  due  to  their  low  energy  values.  Keeping  the  average  number 
of  bits  zx  B  =  1  bit  per  vector,  we  reserve  II  bits  for  coding  of  Xj  and  allocate  the 
remaining  bits  to  the  rest  of  the  wavelet  coefficients.  Table  5.3  shows  the  bit  allocations 
for  all  wavelet  coefficients. 

Table  5.3  BIT  ASSIGNMENTS  TO  WAVELET  COEFFICIENTS 


Wavelet  Coefficients 


mailiBIBiBllBIl 


Codebook  Size 


0  0  0  3  3  0  8 
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2.  Vector  Quantization  of  Wavelet  Coefficients 

Given  the  number  of  bits  for  each  set  of  wavelet  coefficients,  we  construct 
the  codebooks  by  using  the  GL  algorithm.  In  order  to  compensate  for  the  decrease  in  the 
number  of  coefficients  at  lower  resolution  levels,  we  use  the  same  approach  introduced 
in  Section  C.2  of  Chapter  IV.  Consider  the  signal  at  resolution  level  m  which  has 
roughly  one  quarter  of  the  samples  of  that  at  resolution  level  m-1.  First,  we  decompose 
the  level  m-1  signal  into  wavelet  coefficients.  The  resulting  four  wavelet  coefficient  sets 
at  level  m  are  retained.  Now,  we  shift  the  level  m-1  data  by  one  sample  horizontally  and 
repeat  the  DWT  decomposition.  The  resulting  wavelet  coefficients  are  combined  with 
those  already  obtained  at  level  m  to  extend  the  training  set.  This  same  procedure  is 
repeated  using  vertical  and  diagonal  shifts.  The  resulting  data  set  at  level  m  has  roughly 
the  same  size  as  that  at  level  m-1. 

After  constructing  the  codebooks  using  the  training  data  obtained  as  above, 
we  encode  the  wavelet  coefficients  using  the  corresponding  codebooks.  In  the 
reconstruction  procedure,  we  substitute  zeros  for  the  discarded  wavelet  coefficients. 

C.  SIMULATION  RESULTS 

Simulations  were  carried  out  by  using  two  images: 

•  A  training  image  of  512-by-512  pixels  (Fig.  5.3)  constructed  from  pieces  of  four 
different  images  with  a  resolution  of  8  bits  per  pixel  (bpp). 

•  A  test  image  (Lenna)  of  512-by-512  pixels  shown  in  Fig.  5.4. 

The  test  image  was  not  contained  in  the  training  data. 


The  coding  performance  is  measured  by  using  the  peak  signal-to-noise  ratio  (PSNR) 
defined  as  the  ratio  of  square  of  the  peak  input  amplitude  to  the  mean  square  error,  D, 
[Ref  8): 

P5iVK  =  10Iog,o(^)  dB  (5.1) 

where: 

N  N 

D  ^d{x,y)  =  ^53  Y.  ■ 

and  N  is  the  number  of  pixels  along  one  side  of  the  (square)  image. 

Figure  5.5  shows  the  performance  of  WTC  in  terms  of  SNR  vs.  bit-rate,  in  bits  per 
pixel.  We  changed  the  number  of  pixels  in  each  vector  to  obtain  different  bit  rates  while 
keeping  the  number  of  bits  for  each  quantizer  fixed  as  in  Table  5.3.  The  test  image  was 
also  coded  by  DVQ  for  comparison.  The  dotted  line  in  Fig.  5.5  represents  the 
performance  of  DVQ. 

Figure  5.6  shows  the  WTC-coded  test  image  at  0.25  bpp  with  an  SNR  of  27.37  dB 
while  Figure  5.7  shows  the  same  image  coded  by  DVQ  at  0.25  bpp  with  an  SNR  of  20.7 
dB.  The  performance  of  WTC  is  very  good  for  a  compression  rate  of  32: 1 ;  however,  the 
picture  quality  is  effected  by  blurred  and  jagged  diagonal  edges.  This  is  caused  by  the 
smoothing  effect  of  the  DWT,  which  produces  a  positive  effect  in  speech  coding.  Using 
vectors  of  small  size  helps  to  preserve  the  edges;  Figure  5.8  shows  the  test  image  coded 
by  WTC  with  a  vector  size  of  2  pixels  (0.5  bpp)  and  an  SNR  of  29.5  dB.  The  quality 
of  the  coded  image  is  nearly  as  good  as  the  original. 
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Figure  5.4  Test  Image  (Lenna)  of  512x512. 


\ 
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Table  5.4  illustrates  the  contribution  of  each  wavelet  coefficient  set  of  Figure  5.8 
to  the  overall  distortion  with  the  bit  rates  normalized  by  (see  Eq.  4.2  and  4.4).  It  is 
interesting  that  the  contributions  of  various  wavelet  coefficients  to  the  overall  distortion 
are  by  no  means  equal.  It  ranges  between  MSE  =  8.24  for  d^,  and  MSE  =  172.52  for 
.r  j.  Notice  the  high  distortion  in  Xj  despite  the  high  number  of  bits  per  pixel  used  in  the 
allocation.  This  underscores  the  importance  of  allocating  a  large  number  of  bits  for  the 
codebook  of  jc,. 


Table  5.4  BIT  ALLOCATIONS  AND  MSE  CONTRIBUTIONS  OF  WAVELET 
COEFHCIENTS  OF  THE  IMAGE  IN  HG.  5.8 


d'. 

d^. 

d^ 

d': 

d'2 

Bit  Rate  (bpp) 

0 

0 

0 

0.1013 

0.1013 

MSE 

125.76 

131.07 

8.24 

147.21 

171.41 

d'j 

■■ 

d^ 

X3 

Bit  Rate  (bpp) 

0 

0.0748 

0.0748 

0.0374 

0.1029 

MSE 

73.10 

67.08 

97.38 

105.72 

172.52 

We  also  coded  the  test  image  by  using  different  vector  sizes.  Since  most  of  the 
energy  was  contained  in  x^,  we  used  a  vector  dimension  of  2  pixels  for  and  8  pixels 
in  each  vector  for  the  d  coefficients.  The  codebook  sizes  were  kept  the  same  as  in  Table 
5.3.  The  resulting  bit  rate  was  0.2  bpp  with  an  SNR  of  27.15  dB.  The  reconstructed 
image  is  shown  in  Fig.  5.9.  A  comparison  of  Fig.  5.6  and  Fig.  5.9  shows  that  proper 
use  of  different  vector  sizes  can  result  in  lower  bit  rates  with  better  visual  quality.  We 
say  "bf'tter  visual  quality"  instead  of  "better  SNR"  because  the  quality  of  the  image  in 
Fig.  5.9  looks  superior  to  the  one  in  Fig. 5. 6  although  the  SNRs  are  quite  close. 
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Figure  5.8  Lenna  Coded  by  WTC  at  Figure  5.9  Lenna  Coded  by  WTC  at 

0.5  bpp,  SNR  =  29.5  dB.  0.2  bpp  by  Using  Different  Vector 

Sizes,  SNR  =  27.42  dB. 


Figure  5.6  Lenna  Coded  by  WTC  at 
0.25  bpp.  SNR  =  27.37  dB. 


Figure  5.7  Lenna  Coded  by  DVQ  at 
0.25  bpp,  SNR  =  20.7  dB. 


For  different  simulations,  the  improvement  in  performance  of  WTC  over  DVQ  in 
terms  of  SNR  ranged  from  17  to  50  percent  corresponding  to  bit  rates  of  between  0. 16 
and  0.5  bpp.  The  use  of  different  vector  sizes  for  different  wavelet  coefficients  provided 
better  image  quality  compared  to  the  case  of  uniform  vector  size  for  all  wavelet 


coefficients. 


VI.  CONCLUSIONS 


In  this  thesis,  we  combined  algorithms  for  discrete  wavelet  transforms  and  vector 
quantization  to  develop  coding  schemes  for  speech  and  image  data.  We  studied  wavelets 
based  on  both  orthogonal  and  biorthogonal  bases;  we  used  biorthogonal  wavelet  functions 
for  image  coding  since  filters  with  linear  phase  are  desired  in  image  coding. 

The  algorithm  developed  for  the  discrete  wavelet  transform  (DWT)  has  a  pyramidal 
decomposition  architecture  and  is  based  on  the  convolution  of  data  in  each  pyramid  level 
with  wavelet  and  scaling  filters.  Unlike  those  in  traditional  Fourier  theory,  the  basis 
functions  in  the  wavelet  approach  are  formed  by  dilating  and  translating  a  single 
prototype  function.  Given  the  coefficients  of  basis  filters  with  infinite  precision,  the  DWT 
reconstructs  the  original  data  without  any  error. 

Vector  quantization  was  shown  here  to  yield  better  performance  when  combined 
with  the  DWT.  The  results  as  measured  in  SNR  show  that  the  p>erformance  has  improved 
by  17%  to  84%  over  direct  vector  quantization  (DVQ)  depending  on  the  number  of  bits 
per  sample  and  the  type  of  data.  In  speech  coding,  we  found  that  the  subjective  quality 
of  the  coded  speech  was  still  superior  to  that  using  DVQ,  even  though  the  improvement 
in  SNR  obtained  by  using  the  DWT  was  fairly  small.  This  shows  that  SNR  may  not 
always  be  a  good  measure  of  performance.  We  also  tested  the  same  algorithm  on 
transient  signals.  The  performance  of  WTC  in  this  case  well  exceeded  that  of  DVQ. 
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In  image  coding,  we  found  that  WTC  outperforms  DVQ  for  bit-rates  around  0.5 
bit  per  pixel  (bpp).  At  bit-rates  lower  than  0.5  bpp,  WTC  suffered  from  blurred  and 
jagged  edges  although  the  performance  was  still  better  than  DVQ.  This  problem  also 
existed  for  relatively  high  bit-rates,  but  it  was  not  as  predominant  as  it  was  at  low  bit- 
rates. 

The  major  coding  gains  are  due  to  encoding  vectors  of  samples  (vector 
quantization)  and  the  use  of  different  codebooks  for  wavelet  coefficients  at  different 
levels.  Although  using  different  codebooks  increases  the  complexity  and  the 
computational  cost,  this  research  has  shown  that  the  VQ-DWT  combination  improves  the 
quality  of  the  reconstructed  data  over  that  of  DVQ  for  a  given  bit  rate  (bps  or  bpp). 
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APPENDIX  A:  PROGRAM  DETAILS 


This  appendix  contains  the  program  listings  for  each  of  the  MATLAB  algorithms 
in  the  thesis.  The  codes  are  categorized  into  three  classifications,  1-D  DWT,  2-D  DWT 
and  VQ. 


A.  ONE-DIMENSIONAL  DWT  ROUTINES 

function  lowest  =  dwtl(cO,qqq, LOW, file, qq) 

%  DWT  I  One-dimensional  discrete  wavelet  transform 
%  DWT !(...)  computes  the  wavelet  coefficients  of  vector  data  CO 
%  by  using  a  wavelet  filter.  It  stores  the  wavelet  coefficients 
%  in  a  file  and  returns  the  lowest  resolution  level  that  the  data 

%  is  processed.  It  also  stores  the  filter  banks  to  avoid 

%  rebuilding  them  in  reconstruction  process. 

%  CO  =  Input  data  in  vector  format 

%  QOQ  =  Wavelet  filter  type 

%  LOW  =  Lowest  desired  resolution  level 

%  FILE  =  Output  file  containing  the  wavelet  coefficients 

%  QO  =  Filter  tap 

%  Alper  Erdemir 
%  June  1993 


cO  =  cO'; 


%  Get  the  filter  coeffients 

% . - . - 

if  qqq=  =  l 

h  =  coiflet(qq)'; 

wwavelet  =  l’Coif_',num2str(qq)|; 
elseif  qqq==2. 

h=daubdata(qq); 

wwavelet  =  [  ’  Daub'  ,num2str(qq)  1 ; 
else  h  =  |.5  .5|’;wwavelet  = 'HAAR'; 
end 

%  Normalize  the  filter  coefficients 


h  =  sqrt(2)*h: 


%  Intialize  some  of  the  constants: 
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% . — . 

LL  =  leDgth(cO); 

Nh  =  length(h); 

lowest  =  -ceil(log(LL)/log(2)); 

if  abs( lowest)  >  abs(LOW)  lowest  = -abs( LOW);  end 

%  Generate  the  g  coefficients 

% . - . . . . . 

g  =  flipud(h); 
for  i  =  2:2:Nh 
g(i,l)  =  -g(i,l); 
end 

WIN  =256;  %  window  size  for  data  longer  than  WIN 

%  The  following  code  constructs  the  filter  banks  for  the  scaling 
%  filter  and  the  wavelet  filter 

% . . 

if  LL  <  WIN 

H  =  fltbnk(h.2*ceil(LL/2)); 

G  =  fltbnk(g,2*ceil(LL/2)); 

else 

H  =  ntbak(h.WIN); 

G  =  fltbnk(g,WlN); 

end 

%  Decomposition  routine 

% . 

for  lvl  =  -l:-l:lowest 
Ivl 

e val(l  c  ’  ,num2str(abs(l vl)) ,  '=dcmpr,’(c’.... 
num2str(abs(lvl)-l),’,H,Nh.WIN);'|) 
eval(|’d',num2str(abs(lvl)),'  =:dcmpl 

num2str(abs(lvl)-l),’,G,Nh,WIN);’|) 

end 

%  Store  the  wavelet  coefficients 

% . -  - - - 

temp  =  (|; 

for  n  =  1  :abs(lowest) 

eval(l’temp  =  |temp  ”d’,nura2str(n),’  c\int2str(n).’  ”l;’l) 

end 

evald’save  ’,file.’_’,wwavelet,'  cO  H  G  WIN  wwavelet  Nh  '.tempi) 
%  End  of  routine 

% . . . - - - 


function  x  =  fltbnk(filt,L) 

%  FLTBNK  Construction  of  filter  matrices  for  DWT 
%  FLTBNK(FILT,L)  constructs  the  filter  matrix  using  the  filter  FILT. 
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%  This  routine  is  also  used  for  2-D  DWT. 

%  FILT  =  Filter 

%  L  =  Length  of  data  to  be  processed 

%  Alper  ERDEMIR 
%  June  1993 

F  =  length(filt); 

Lc  =  L  +  2*F^; 

Ll  =  (Lc-F)/2+l; 

x  =  zeros(Ll,Lc); 
for  n  =  1 :  LI 

x(n,:)  =  l2eros(l,2*(n-l))  filt’  zeros! l,Lc-F-2*{n-l))|; 

end 

%  End  of  routine 

% . . . . 


function  x  =dcmpl(data,F,lF,W) 

%  DCMPl  One-dimensional  DWT  decomposition  routine. 

%  DCMP1(...)  conducts  the  DWT  decomposition  of  the  input  DATA 

%  by  usmg  the  filter  bank  F  with  a  wmdow  size  of  W. 

%  DAT  A  =  Input  data 

%  F  =  Wavelet/Scaling  filter  matrix 

%  IF  =  Wavelet/Scalmg  filter  length 

%  W  =  wmdow  length  for  relatively  long  data 

%  Alper  ERDEMIR 
%  June  1993 

L  =  length(data); 

LF  =  IF-2; 


%  The  following  four  lines  checks  if  the  data  vector  is  odd.  if  it  is 
%  true,  then  a  zero  is  added  to  it. 

% . - . . . — . 

if  rem(L/2,floor(L/2))  ~  =  0 
data  =  |data;0|; 

L  =  L+I; 

end 

N  =  rix(L/W); 

%  Add  LF  zeros  to  the  beginning  and  end  of  DATA 

%- . . . - - - 

data  =  |zeros(LF,  1  );data;zeros(LF,l)|; 

%  Decomposition  routine 

%  - . . . -  - . . - . 

LLL  =  length(data); 

LL  =  (LLL-IF)/2+  1; 
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if  N  >  =  1 

Ul=(W+2*LF-IF)/2  +  l; 

x(l;Lll)  =  F(l:Lll,l:W  +  2*'  P)*daU(l:W  +  2*LF); 

LI2  =  (WfLF-lF)/2+l; 
if  N>  =2 

for  n  =  I :  N  - 1 

x(Lll  +(n-l)»U2  +  l:Lll  +n»U2)  =  F{l:LI2,l:W-hLF)... 

*data(LF  +  n*W  +  1  :(n+  l)*W  -!-2*LF); 

end 

end 

REM  =  rem(L,W); 

if  REM  ~  =  0 

Lrem  =  (REM  LF-IF)/2  +  1 ; 

x(LL-Lrein+  l:LL)  =  F(l:Lrem,I:REM  +  LF)*data(LLL-REM-LF+  1:LLL); 

end 
x  =  x’; 

else 

X  =  F(  I ;  LL,  1 ;  LLL)*data: 
end 

%  End  of  routine 

% . 


function  ses  =  idwtl(file) 

%  IDWTl  One-dunensional  inverse  discrete  wavelet  transform 
%  IDWTl  (FILE)  returns  the  reconstructed  data  SES  by 
%  taking  the  inverse  wavelet  transform  of  wavelet  coefficients 
%  stored  in  FILE  DWT. 

%  Alper  ERDEMIR 
%  June  1993 

%  Lode  the  wavelet  coefficients 

% . 

evald'load  ’.file,’_dwt'|) 

%  Start  the  reconstruction  routine 

% . - . - . . . 

eval(l'ses  =  c',num2str(lowest),';’|) 

FF  =  Nh-2; 

FFF  =  floor((FF-l)/2+  1); 

WINl  =WIN-2*FF; 
for  lvl  =  -lowest:  I  :-l 

L  =  length(ses); 

WIN2  =  floor!  (WIN  1  +  Nh- 1  )/2); 

N  =  rix((L-FFF)/WIN2); 

eval(| 'a  =  length(d’.num2str(-lvl-l ),’);'!) 

if  N=  =0 

Lu  =  (L-l)*2  +  Nh; 

eval(|’dwork  =  G(l  :L,I  :Lu)”*d',num2str(-lvl),’;'|) 
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ses  —  H(  1  :L,  1  :Lu)'  •ses  +  dwork; 

Lp=  length(ses)-Nh  +  2; 
ses  — ses(Nh-2 -f  l:Lp); 
ses  =  ses(  1  :a); 

else 

Lu=-(WIN2-1)*2  +  Nh; 
e  val(  I'd  work  -G(  1  :WIN2,1  :Lu)'’*d’,... 

num2str(-lvl),’(  I :  W1N2);'|) 
cwk(  I  :Lu)  -  H(  1 :  W1N2, 1  :Lu)’*ses(  1  :WIN2)  +  dwork; 

Lp-  length(cwk)-FF; 
cwk  cwk(Nh-2  +  1  :Lp); 

NN  =  length(cwk); 
if  N  >  -  2 

Lu2  =  Lu+FF; 

W1N.1  =  WIN2  f2*FFF; 
for  n  =  I  :N  - 1 

eval(|  'dwork  =  G(  1 :  WIN3,l:Lu2)"*d',... 

num2str(-lvl).'((n-l  )*’V1N2-FFF  +  W1N2  +  I  :n’‘WIN2  +  FFF  +  W1N2);'  |) 
cwk2  —  Fl(  I :  WIN3,1  :Lu2)’*ses(n*... 

WIN2-FFF  +  1  ;(n  +  1  )»W1N2  +  FFF)  +  dwork; 

Lp2  =  length(cwk2)-FF; 

cwk(NN  -^(n-1  )*(Lp2-FF)  +  I  :NN  +n*(Lp2-FF))  =  cwk2(FF  +  1  :Lp2); 

end 

NN  =  length(cwk); 

end 

REM  =  L-N*'iVIN2; 
if  REM  -  =  0 

Lrem  =  (REM-I  )*2  +  Nh', 

eval(rdwork  =  G(  1  :{FFF  +  REM),I  :Lrein)"*d',... 

num2str(-lvl),’(L-REM-FFF  + 1  ;L);‘|) 
cwk2  =  Fl(  I  :(REM  +  FFF).  1  :Lrem)'*ses(L-REM-FFF+  1  :L)  +  dwork; 

Lp  =  length(cwk2); 

cwkiNN  +  1  ;NN  +  Lp-FF)  =  cwk2(FF  +  I  :Lp); 

end 

ses  =  cwk(l:a)'; 

end 

end 

%  End  of  routine 


B.  TWO-DIMENSIONAL  DWT  ROUTINES 

function  lowest  =dwt2(c0, fit, LOW. file. qq) 

%  DWT2  Two-dimensional  discrete  wavelet  transform 
“T  DWT2(...)  computes  the  wavelet  coefficients  of  two-dimensior»l 
%  data  CO  by  using  a  wavelet  filter  FLT  in  both  horizantal  and  vertical 

“T  directions.  It  stores  the  wavelet  coefficients  in  a  FILE  DWT  and 

%  returns  the  lowest  resolution  level  that  the  data  is  processed. 

%  Both  orthogonal  and  biorthogonal  decorapostion  can  be  done 
%  by  using  appropriate  filteis. 

T  CO  -  Input  data  in  vector  format 
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%  FLT  =  Wavelet  filter  type 

%  LOW  =  Lowest  desired  resolution  level 

%  FILE  =  Output  file  containing  the  wavelet  coefficients 

%  QQ  -  Filter  tap 

%  Alper  Erdemir 
%  June  l‘J‘)3 


%  Construct  2-D  filter  banks  for  decomposition 

% . - - - - - . 

lHH.HV.GH,GV,Nh,Nv|  =  filts2(cO,nt,qq.l); 

%  Determine  of  the  lowest  possible  resolution  level 

% . . -  - . . . . 

ILI  L2|  =size(cO); 

rsl  =  min(ceil(log(Ll  )/log(2)),ceil(log(L2)/log(2))); 
lowest  =  abs(rsl); 
if  lowest  >  LOW 

lowest  =  LOW ; 

end 

%  Decomposition  routme 

% . 

for  lvl=  1 : lowest 

eval(|’c',num2str(lvl),'  =dcmp2(c‘.... 
num2str(lvl-l  ),’,HH,HV,Nh.Nv),’|) 
eval(('dr.num2str(lvl),’  =  dcmp2(c'.... 
num2str(lvl-l),’.HH,GV,Nh.Nv):’|) 

eval([’d2’,num2str(lvl),'  =dcmp2(c',... 

num2str(lvl-l).’.GH,HV.Nh,Nv);'l) 
eval(|  ■d3’,num2str(lvl),'  =dcmp2(c',... 
num2str(lvl-l  ),’,GFI,GV,Nh,Nv);’|) 

end 

%  Store  the  wavelet  coefficients 

% . . 

temp  =  11; 
for  n=  1 :  lowest 

for  m  =  1 ;? 

eval(| 'temp  =  (temp  "d'.int2f .r(m),mt2str(n),’  "1;'|) 

end 

eval(l'temp  =  ltemp  ’'c’,int2str(n),'  ”|;'|) 

end 

evaUI 'save  '.file,'  dwt  fit  qq  '.tempi) 

%  End  of  routine 

% . - - - - - - 
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function  lHH.HV,GH,GV,Nh,Nvl  =  filts2(cO,nt,qq,dir) 

%  FILTS2  construction  of  2-D  filter  banks 

%  F1LTS2  returns  the  2-D  filter  matrices  using  horizantal  and 
%  vertical  wavelet  and  scalmg  filters.  It  also  returns  the  filter 
%  lengths. The  available  filter  types  are  Daubechies  I  -  to  10-tap 
%  filters,  Haar  filter,  and  biorthogonal  1-  to  5-tap  filters. 

%  CO  =  Input  data  m  matrix  format 

%  FLT  =  filter  type  (for  both  directions; 

%  QQ  -  Filter  tap 

%  DIR  =  Indicator  for  decomposition  and  reconstruction  procedure 

%  (used  for  biorthogonal  filters) 

%  Alper  ERDEMIR 
%  June 

%  Get  the  filter  coefficient 

% . . . 

if  tit  =  =-  I 

Hh  =  daubdata(qq); 
elseif  fit  =  =  2 

Hh  =  |.5  .51’; 
elseif  fit  =  =  .1 

|Hh,Hh2|  =  biort(qq); 

Hh  =10;Hh|; 

Hh2  =  10;Hh2|; 

Hv2  =  Hh2; 

end 

Hv  =  Hh;  %  Let  vertical  filter  equal  to  horizantal  one 

Nh  -  length(Hh); 

Nv  =  length(Hv); 

%  Generate  the  g  coefficients 

% . - . 

if  tit  <  -  2 

Gh  =  flipud(Hh); 
for  i  =  2:2:Nh 

Gh(i,l)  =  -Gh(i.l); 

end 

Gv  =  flipud(Hv); 
for  i  =  2:2:Nv 

Gv(i,l)  =  -Gv(i,l); 

end 

else 

if  dir  =  =-  I 

Gh  =  nipud(Hh2); 
for  i-2:2:Nh 

Gh(i.l)  -  -Gh(i,l); 

end 

Gv  -  flipud(Hv2); 
for  I  -2:2:Nv 
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Gv(i,l)  =  -Gv(i,l); 


end 

else 

Gh  =  flipud(Hh); 
for  i  =  2:2;Nh 

Gh(i.l)  =  -Gh(i,l); 

end 

Gv  =  flipud(Hv); 
for  i  =  2;2:Nv 

Gv(i,l)  =  -Gv(i,l); 
end 

Hh-Hh2;Hv  =  Hv2: 
end 

end 

|drowO,dcolO|  =  size(cO); 

%  Construct  the  filter  banks 

% . - . . . 

HH  =  sqrt(2)*fltbnk(Hh,2*ceil(dcolO/2)); 
HV  =  sqrt(2)*fltbnk(Hv,2*ceil(drowO/2)); 
GH  =sqrt(2)*fltbnk(Gh,2*ceil(dcolO/2)); 
GV  =  sqrt(2)*fltbnk(Gv,2*ceil(drowO/2)); 

%  End  of  routine 

% . 


function  x=dcmp2(data,Fh,Fv,lFh,lFv) 

%  DCMP2  Two-dimensional  DWT  decomposition  Routine 
%  DCMP2(...)  conducts  the  DWT  decomposition  of  the  input  DATA 

%  by  usmg  horizantal  and  vertical  filter  banks,  FH  and  FV, 

%  DATA  =  Input  data 

%  FH  =  Horizantal  wavelet/scaling  filter  matrix 

%  FV  =  Vertical  wavelet/scaling  filter  matrix 

%  IF  =  Horizantal  wavelet/scalmg  filter  length 

%  IF  =  Vertical  wavelet/scalmg  filter  length 

%  Alper  ERDEMIR 
%  June  1993 

ILv  Lh|  =  size(data); 

%  If  the  horizantal  length  of  the  data  is  odd.  add  a  column  of  zeros 


if  rem(Lh/2,floor(Lh/2))  ~  =  0 
data  =  (data  zeros(Lv,l)|; 

Lh  =  Lh-(-l; 

end 

%  Add  IFh-2  columns  to  left  and  rigth  of  the  data 


62 


% . . - . . 

data  =  lzeros(Lv,IFh-2)  data  zeros(Lv,lFh-2)|; 

vlen-Lh  +  2*(IFh-2);  %  Width  of  the  Hor.  Filt.  Matrix 

hlen  =  (vlen-lFh)/2  +  1 ;  %  Fleigth  of  the  Flor,  Filt.  Matrix 

%  Fionzantal  decomposition  routme 

% . .  - . - 

xh  =  (Fh(  1  :hlen,  1  :vlen)*data’)’; 

%  If  the  horizantal  length  of  the  data  is  odd.  add  a  column  of  zeros 

%  . . -  - - - - - 

if  rem(Lv72,floor(Lv/2))  ~  —  0 
xh  -  (xh;zeros(  I  ,hlen)|; 

Lv  =  Lv  +  I; 

end 

%  Add  IFh-2  columns  to  left  and  rigth  of  the  data 

% . - . - . . . - 

xh  =  |zeros(IFv-2,hlen);xh;zeros(lFv-2,hlen)|; 

vlen2  =  Lv  +  2*(lFv-2);  %  Width  of  the  Vert.  Filt.  Matrix 

hlen2  =  {vlen2-IFv)/2+ 1;  %  Heigth  of  the  Vert.  Filt.  Matrix 

%  Vertical  decomposition  routine 

% . 

X  =  Fv(  I  :hlen2. 1 :  vlen2)*xh; 

%  End  of  routine 

% . 


%function  im  =  idwt2(file) 

%  IDWT2  Two-dimensional  mverse  discrete  wavelet  transform 
%  IDWT2(FILE)  returns  the  reconstructed  image  IM  by 
%  taking  the  inverse  wavelet  transform  of  wavelet  coefficients 
%  stored  in  FILE  DWT. 

%  Alper  ERDEMIR 
%  June  199.1 

%  Load  the  wavelet  coefficients  and  construct  the  filter  matrices 

% . . . . 

%eval(|'c0  =  load2(”'. file," ’);’]) 

%eval(|’load  ’.file,’  dwt'|) 
if  -exist(’HH’) 

[HH,HV,GH.GV,Nh.Nv|  =  filts2(c0.flt.qq,2); 

end 

%  Reconstruction  routme 

% . - - - 
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evaUl’im  =  c’,num2str(lowest),’;’|) 
for  lvl  =  lowest:-l :  1 

cwrk  =  rctver2(iin,HV,Nv); 

eval(|  'dwrkl  =  rctver2(dl  ',nuin2str(lvl),’.GV,Nv);‘|) 
datvl  =cwrk+dwrkl ; 

evaUl  ’dwrk2  =  rctver2(d2’,num2str(lvl).‘,HV,Nv);’|) 

eval(|  'dwrk3  =  rctver2(d3’,num2str(lvl),',GV,Nv);’|) 

dat  v2  =  d  wrk2  +  d  wrkJ ; 

dathl  =rcthor2(datvl  ,HH,Nh); 

dath2  =  rcthor2(datv2.GH,Nh); 

im=  dathl  fdath2; 

|Lv  Lh|  =  si2e(iin); 

im  =  iin(Nv-2+  l:Lv,Nh-2  +  l:Lh); 

if  Ivl  -  =1 

eval(|'|al  a2|  =  size(dl  ',Qum2str(lvl-l),’);’|) 
else  |al  a2|  =  size(cO); 
end 

ini  =  iin(  I  :al  ,1  :a2); 


%  End  of  routine 


function  x  =  rcthor2(data,Fh,lFh) 

%  RCTHOR2  conducts  the  2-D  horizantal  reconstruction  of 
%  wavelet  and  approximation  coefficients. 

%  DATA  =  Input  data 

%  FH  =  Horizantal  wavelet/scaling  filter  matrix 

%  LFH  =  Horizantal  wavelet/scalmg  filter  length 

%  Alper  ERDEMIR 
%  June  1993 

|Lv  Lhl-size(data); 
vlen  =  (Lh-l)*2  +  IFh; 

%  do  the  horizantal  decomposition 

%  - - - - - .  - 

X  =  (Fh(  I  :Lh,  I  :vlen)'*data')'; 

%  End  of  routine 

% . - . - - - - - 


function  x  =  rctver2(data,Fv,IFv) 

%  RCTVER2  conducts  the  2-D  vertical  reconstruction  of 
%  detail  and  approximation  coefficients. 

%  DATA  =  Input  data 

%  FV  =  Vertical  wavelet/scaling  filter  matrix 
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%  LFH  =  Vertical  wavelet/scaling  filter  length 

%  Alper  ERDEMIR 
%  June  199.1 

|Lv  Lh|  -  size(data); 
vien  =  (Lv-I  )*2  +  IFv; 

%  do  the  vertical  decomposition 

% . - . — 

X  -  Fv(  I  :Lv,  1 :  vlen)’*data; 

%  End  of  routine 

% . . . . 

C.  VQ  ROUTINES 

function  w  =  initcb(x,B) 

%  INITCB  Initialization  of  the  codebook  for  VQ 
%  IN1TCB(X,B)  initializes  the  codebook  randomly  selectmg  data 
%  vectors  from  the  subject  data  set  X. 

%  X  -  Data  set 

%  B  =  Total  number  of  bits 

%  Alper  Erdemir 
%  June  1993 

N  -2*8;  %  Find  the  codebook  size 

%  Randomly  select  the  codewords  from  X 

% . 

rand( 'uniform’) 

Nx  =  max(size(x)); 
for  I  =  1  :N 

w(:,i)  =  x(:.ceil(Nx*rand(  1 ))); 
end 

%  End  of  routine 

% - - - - 


function  x  =dat2vec(y.M) 

%  DAT2VEC  Conversion  of  data  format  to  be  used  in  VQ 
%  DAT2VEC(Y,M)  converts  the  mput  data  into  the  VQ  format 
%  Y  =  Input  data  of  Nxl  (for  1-D)  or  NxN  (for  2-D)  where  N  is 

%  data  length 

%  M  =  vector/block  size 

%  Alper  ERDEMIR 
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%  June  I99,J 


N  ^sizejy); 
nl  =  M(  1 ); 

%  if  input  data  is  1-D 


,f  N{l)  1  1  N(2)  ==  1 
in_hei  =  length(y); 
nla  =  ceil(ui_hei/nl ); 

y=y(:); 

X  -  ly;zeros(nl  ’nla-inhei,  I  )|; 

X  =  reshape(  x  ,n  I  ,n  I  a) : 

%  else,  input  is  2-D 

% . -  - . . 

else 

n2  =  M(2); 

nla  =  ceil(N(  1  )/nl ); 

n2a  =  ceil(N(2)/n2); 

X  =  zeros(n  1  •n2,n  1  a*n2a); 
yy  =  zeros(n  1  a*n  1  ,n2a*n2); 
yy(l:N(l),l:N(2))=y; 
for  i=  l:nla 

il=(i-l)*nl  +  l; 
for  j  =  I  :n2a 

k  =  (i-l)*n2a  +j; 
ji  =0-J)*n2  + J; 
z  =  yy(il:il  +nl-l,jl:jl  +n2-l); 
x(:,k)  =  z(:): 

end 

end 

end 

%  End  of  routme 

% . 


function  x  =  vec2dat(y.N,BLK) 

%  VEC2DAT  conversion  of  the  data  from  VQ  format  to  original  format 
%  VEC2DAT(Y,N.BLK)  converts  the  data  Y  in  VQ  format  to  original 
%  format  of  size  Nxl  (for  1-D)  or  size  NxN  (for  2-D). 

%  Y  =  Subject  data  in  vector  format 

%  N  =  Size  of  onginal  data 

%  BLK  =  Vector/block  size 

%  Alper  Erdemir 
%  June  1993 

%  if  data  is  1  -D 

% . - . . . 
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if  nargui  =  =  2 

y  =  y(:); 

x-y(l;N); 

%  else  It  IS  2-D 

% . - . 

else 

nla  =  eeil(N(l)/BLK(l)); 
n2a  =  ceil(N(2)/BLK(2)): 

X  =  zeros(BLK(  1  )*N(  1  ),BLK(2)*N(2)); 
for  I  -  I  ;nla 

il  =(i-l)*BLK(l)+l; 
for  j  =  I  :n2a 

k  =  (i-l)*n2a+j; 
jl  =(j-l)*BLK(2)  +  1; 
z  =  zeros(BLK(  1  ),BLK(2)); 
for  1=  1:BLK(2) 

in  =  (l-l)*BLK(l)+l; 
z(:,l)  =y(iii:in  +  BLK(1)-1  .k); 

end 

x(il:il  +BLK(l)-l,jl:jl  +BLK(2)-I)  =  z; 

end 

end 

x  =  x(l:N(l),l:N(2)); 

end 

%  End  of  routine 

% . 


function  |w.m|  =  lbg(tr,bit,BLK) 

%  LBG  conducts  the  generalized  Lloyd's  algorithm 

%  LBG(TR,BIT,BLK)  returns  the  trained  codebook  for  VQ.  The  code  first 
%  initializes  the  codebook,  then  trams  it  by  using  Lloyd's  iteration. 

%  The  traming  stops  when  the  criteria  equals  to  0.01. 

%  TR  =  Training  data 

%  BIT  =  Total  number  of  bits  to  costruct  the  codebook 

%  BLK  =  Vector/block  size 

%  Alper  Erdemir 
%  June  1993 

x  =dat2vec(tr,BLK); 

%  Initialize  the  codebook 

% . . . . . 

w  =  mitcb(x.bit); 
m  =  ||; 

N  =  0; 

%  Do  the  Lloyd's  iteration  until  convergence 
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% . — . 

for  n  =-■  1 :2 

w  =  lloyd(x,w); 
m  =  mse(x,w,ni); 
save  inf  m  w 

N  =  N+  1; 
end 

while  I 

w  =  lloyd(x,w); 
m  =  mse{x,w,m); 
save  inf  m  w 

N=N+1; 

if abs((m(N-l)-ni(N))/m(N-l))  <=  0.01 
brtak; 

end 

end 

%  End  of  routine 

% . - .  - 


function  wup  =  lloyd(x,w) 

%  LLOYD  Lloyd’s  iteration  for  VQ 

%  LLOYD(X,W)  performs  the  Lloyd  iteration  for  codebook  improvement 

%  to  the  given  codebook  W  by  using  the  training  data  set  X. 

%  X  =  T  raming  data 

%  W  =  Previous  codebook 

%  Alper  ERDEMIR 
%  June  1993 

L  =  size(x); 

N  =  stze(w); 

wup  =  zeros(N(l),N(2)); 
u  =  zeros(N(2).l): 
y  =ones(  1  ,N(2)); 

%  Partition  the  trammg  set  into  clusters  by  using  NNC 

% . - . . . . . 

for  k=l:L(2) 

d  =  sum((x(:,k)*y-w).*2); 
lna,iw|  =  min(d); 
vvup(:,iw)  =  wup(;,iw)  +  x(;,k); 
u(iw)  =  u(iw)+  1; 

end 

%  Take  care  of  the  empty  clusters  (  <  =3  training  vectors) 

%  by  splittmg  the  centroid  of  the  clusters  with 
%  the  highest  numbers  of  traming  vectors 

% . -  - . - 

i  =  find(u=  =0); 
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u(i)  =  ones(size(i)); 

wup  =  wup./(u*ones(  1  ,L(  I )))’; 

i  =  find(u<4); 

|Y  l|=sort(u); 

I  =  flipud(I); 
for  n=  l:length(i) 

?ps  =  norin(wup(;,l(n)))*.OOI ; 
wup(:,i(n))  =  wup(:,l(n))  +  eps; 
wup( : ,  l(n))  =  wup( ;  ,l(n))-eps; 
u(i(n))  =  u(l(n)); 


%  End  of  routine 


function  tn  =  tnse(x,w,iii) 

%  MSE  Computation  of  mean-square  error  of  a  codebook 
%  MSE(X,W.M)  computes  the  mean-square  error  of  the  given  codebook 
%  W,  and  returns  it  by  appending  the  current  value  to  an  existing 
%  vector  M,  which  has  the  MSE  record  for  each  iteration  of 
%  the  codebook 
%  X  =  T ramuig  data 

%  W  =  Current  codebook 

%  M  =  Vector  contammg  the  MSE  record 

%  Alper  Erderair 
%  June  1993 

N  =size(w); 

Nx  =  max(size(x)); 
msel  =0; 
y  =ones(  I  ,N(2)); 

for  k=l:Nx 

d  =  sum((x(;,k)*y-w),*2); 
mse  1  =  mse  1  +  min(d); 

end 

mse  1  =  mse  1  /(Nx*N(  1 )); 
m  =  |m,msel  |; 

%  End  of  routme 

% . -- . . 


function  |z.mse|  =  code(x,w) 

%  CODE  coding  of  data  set  X  using  codebook  W 

%  CODE(X,W)  returns  the  encoded  data,  Z,  and  the  mean-square  error 
%  of  coding. 
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% 


X  =  Data  set  to  be  coded 
W  =  Codebook 


%  Alper  ERDEMIR 
%  June  1993 

N  =size(w); 

Nx  =  inax(size(x)); 
mse  =  0; 

y  =ones(  1  ,N(2)); 
z  =  zeros(N(l),Nx); 

%  Find  the  closest  code  vector  to  each  input  vector 

% . - . - . -  -  - 

for  k  =  l:Nx 

d  =  suin((x(:,k)*y-w),^2); 

|ind.iwl  =  min(d); 
z(;,k)  =  w(:.iw); 
tnse  =  mse  +  md; 
end 

%  Find  the  mean-square  error 

% . 

mse  =  mse/(Nx*N(  1 )); 

%  End  of  routine 

% . 
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